Abstract

In statistical genomics, bioinformatics, and neuroinformatics, truth values of multiple
hypotheses are often modeled as random quantities of a common mixture distribution in
order to estimate false discovery rates (FDRs) and local FDRs (LFDRs). Unfortunately, the
FDRs or LFDRs are typically reported with conventional confidence intervals or point
estimates of a parameter of interest rather than shrunken interval and point estimates
consonant with the hierarchical model that underlies LFDR estimation. In a pure Bayesian
approach, the shrunken estimates may be derived from a fully specified prior on the
parameter of interest. While such a prior may in principle be estimated under the
empirical Bayes framework, published methods taking that approach require strong
parametric assumptions about the parameter distribution.



The proposed approach instead extends the confidence posterior distribution to the 
semi-parametric empirical Bayes setting. Whereas the Bayesian posterior is defined 
in terms of a prior distribution conditional on the observed data, the confidence pos- 
terior is defined such that the probability that the parameter value lies in any fixed 
subset of parameter space, given the observed data, is equal to the coverage rate of the 
corresponding confidence interval. A confidence posterior that has correct frequentist 
coverage at each fixed parameter value is combined with the estimated LFDR to yield a 
parameter distribution from which interval and point estimates are derived within the 
framework of minimizing expected loss. The point estimates exhibit suitable shrinkage 
toward the null hypothesis value, making them practical for automatically ranking fea- 
tures in order of priority. The corresponding confidence intervals are also shrunken and 
tend to be much shorter than their fixed-parameter counterparts, as illustrated with 
gene expression data. Further, simulations confirm a theoretical argument that the 
shrunken confidence intervals cover the parameter at a higher-than-nominal frequency. 

Keywords: confidence distribution; empirical Bayes; high-dimensional biology; large-scale 
inference; local false discovery rate; multiple comparison procedure; multiple testing; predic- 
tive distribution; random effects model 



1 Introduction 



By enabling simultaneous tests of whether each of thousands of genes represented on a
microarray is differentially expressed across experimental or clinical conditions,
advances in biotechnology have led to increased use of the false discovery rate (FDR) as a
solution to extreme multiple comparisons problems. As a result, the statistical community
has developed more general and more powerful methods of controlling what Benjamini and
Hochberg (1995) called the FDR while proposing new definitions of the FDR (Farcomeni,
2008). The alternative strategy of estimating rather than controlling the FDR in turn led
Efron et al. (2001) to propose estimating the local false discovery rate (LFDR), a
limiting case of an FDR. Recently, Yanofsky and Bickel (2010) found LFDR estimators to
perform well in terms of prediction error computed with gene expression microarray data,
and Schwartzman et al. (2009) applied LFDR methods to the analysis of neuroimaging data.
(The terminology here follows the empirical Bayes convention of referring to predictors of
random quantities such as the LFDR as estimators.)

FDR estimation begins with the reduction of the data directly bearing on each null 
hypothesis to a low-dimensional statistic such as a Student t statistic or a p-value and the 
specification of a subset of reduced-data space called the rejection region. In a general 
empirical Bayes framework, the Bayesian FDR (BFDR) is the conditional probability that 
a null hypothesis is true given that it is rejected, that is, given that its statistic lies in
the rejection region (Efron and Tibshirani, 2002). Relaxing the requirement that all null
hypotheses share the same rejection region and instead setting the rejection region of each 
null hypothesis to the set containing only the observed value of its statistic generates a 
different BFDR for each hypothesis; such a BFDR is called an LFDR. The LFDR of a null 
hypothesis is the conditional probability that it is true given that its statistic is equal to 
its observed value. Thus, estimates of the LFDR are often interpreted as approximations of 
fully Bayesian posterior probabilities that could have been computed were a suitable joint 
prior distribution of all unknown parameters available. 



However, from a hierarchical Bayesian perspective, the LFDR estimate suffers as an ap- 
proximation of a hypothesis posterior probability in its failure to incorporate the uncertainty 
in the parameters. Similarly, from a frequentist perspective, the point estimate of the LFDR 
would seem less desirable than an interval estimate of the LFDR since the latter would 
reflect uncertainty in the true value of the LFDR, and correlations between data of
different biological features can introduce substantial variability into FDR and LFDR
estimates (Bickel, 2004; Qiu et al., 2005). Efron (2010) addressed the problem of estimate
accuracy by providing asymptotic bounds on the confidence limits of the FDR in the
presence of correlation between statistics. Nonetheless, it is not clear how reporting a
standard error or confidence interval for the LFDR of each of thousands of null hypotheses
would facilitate the interpretation of the results (Westfall, 2010).

Fortuitously, as the probability of hypothesis truth, the LFDR itself is of much less direct 
biological interest than is the random parameter about which a hypothesis is formulated. 
Both the Bayesian and frequentist criticisms that LFDR estimation inadequately incorpo- 
rates uncertainty in the parameter distribution may be answered by constructing conservative 
confidence intervals for the random parameters of interest under the finite mixture model
that underlies LFDR estimation, as Ghosh (2009) accomplished for a mixture of two normal
distributions.

The assumption of a known parametric model for the random parameter will be dropped
in Section 2, which instead uses a confidence posterior, a continuous distribution of
confidence levels for a given hypothesis on the basis of nested confidence intervals. Like
the Bayesian posterior, the confidence posterior is an inferential (non-physical)
distribution of the parameter of interest that is coherent according to various decision
theories (Bickel, 2010a,b). Unlike the Bayesian posterior, the confidence posterior does
not require specification of or even compatibility with any prior distribution. The
interest parameter $\theta$ is a subparameter of the full parameter $\xi$, which specifies
the sampling probability distribution $P_{\xi}$. In the case of a one-dimensional
parameter of interest, the confidence posterior is completely specified by a set of nested
confidence intervals with exact coverage rates. Given the observed realization $x$ of a
$P_{\xi}$-distributed data vector $X$, the confidence posterior distribution $P^{x}$ is
defined such that the probability that the parameter lies in a given interval
$[\theta', \theta'']$ is equal to the coverage rate of the confidence interval equal to
that given interval. That is,

$$P^{x}(\vartheta \in [\theta', \theta'']) = P_{\xi}(\theta \in \hat{\Theta}_{\rho}(X)) = \rho, \tag{1}$$

where $\vartheta$ is the random interest parameter of distribution $P^{x}$, and
$\hat{\Theta}_{\rho}$ is the interval estimator with rate $\rho$ of coverage constrained
such that $\hat{\Theta}_{\rho}(x) = [\theta', \theta'']$. To distinguish
$P^{x}(\vartheta \in [\theta', \theta''])$ for a specified hypothesis that
$\theta \in [\theta', \theta'']$ from a confidence interval of a specified confidence
level $\rho$, Polansky (2007) called the former an observed confidence level.
Marginalizing the confidence posterior over the estimated LFDR as the probability of null
hypothesis truth shrinks the confidence posterior toward the parameter value of the null
hypothesis. Shrunken interval and point estimates are then derived from the marginal
confidence posterior.

The use of the shrunken estimates will be illustrated in Section 3 with an application
to gene expression data. Section 4 reports a simulation study of the shrunken confidence
interval and point estimates. Section 5 closes with a summary of the findings.

2 Frequentist posteriors for shrunken estimates 
2.1 Confidence posterior distributions 

Considering the observed data vector $x \in \mathcal{X}^n$ as a sample from a distribution
in the parametric family $\{P_{\xi} : \xi \in \Xi\}$ parameterized by $\xi$ in
$\Xi \subseteq \mathbb{R}^d$, the value in $\Theta \subseteq \mathbb{R}^1$ of the parameter
of interest is denoted by $\theta(\xi)$. The function
$F_{\bullet} : \mathcal{X}^n \times \Theta \to [0,1]$ is called a significance function if
$F_X(\theta)$ is distributed uniformly between 0 and 1 for all $\theta \in \Theta$ and if
$F_x$ is a cumulative distribution function for all $x \in \mathcal{X}^n$. Due to the
latter property, the significance function evaluated at $x$ is also known as the
confidence distribution (Fraser, 1991; Singh et al., 2005), but Efron (1993) and Schweder
and Hjort (2002) used that term in the sense of the following probability distribution.
Given any $x \in \mathcal{X}^n$, the confidence posterior $P^{x}$ is the probability
measure on measurable space $(\Theta, \mathcal{B}(\Theta))$ of a random quantity
$\vartheta$ such that $P^{x}(\vartheta \le \theta) = F_x(\theta)$ for all
$\theta \in \Theta$, where $\mathcal{B}(\Theta)$ is the Borel $\sigma$-field on $\Theta$.
It is easy to verify that equation (1) holds for all $\xi \in \Xi$ and
$\theta', \theta'' \in \Theta$ and for any measurable function
$\hat{\Theta}_{1-\alpha_1-\alpha_2}$ that satisfies

$$\hat{\Theta}_{1-\alpha_1-\alpha_2}(x) = \left[ F_x^{-1}(\alpha_1), F_x^{-1}(1-\alpha_2) \right]$$

for every $x \in \mathcal{X}^n$ and every $\alpha_1$ and $\alpha_2$ in $[0,1]$ such that
$\alpha_1 + \alpha_2 < 1$.

As a Kolmogorov probability measure on parameter space, the confidence posterior yields
coherent decisions in the sense of minimizing expected loss, as does the Bayesian
posterior, and yet without dependence on any prior distribution (Bickel, 2010a,b). For
example, the confidence posterior mean, minimizing expected squared error loss, is
$\bar{\vartheta}_x = \int_{\Theta} \vartheta \, dP^{x}(\vartheta)$, and the confidence
posterior $p$-quantile, minimizing expected loss for a threshold-based function of $p$
(Carlin and Louis, 2009, App. B), is $\vartheta(p)$ such that
$p = P^{x}(\vartheta \le \vartheta(p))$.



Example 1. Assume that $Y_j$, the observable, log-transformed difference in levels of
expression of a particular gene between the $j$th individual of the treatment group and the
$j$th individual of the control group, is a normally distributed random variable of unknown
mean $\theta$ and unknown variance $\sigma^2$. For the observed differences
$y_1, \dots, y_n \in \mathcal{X} = \mathbb{R}$, the $n$-tuple $x = (y_1, \dots, y_n)$ is
thus modeled as a realization of $X = (Y_1, \dots, Y_n)$, with $Y_i$ independent of $Y_j$
for all $i \ne j$. Then the one-sample $t$-statistic $\tau(X)$ has the Student $t$
probability distribution of $n - 1$ degrees of freedom. The significance function
$F_{\bullet}$ and confidence posterior $P^{x}$ satisfy

$$F_x(\theta) = P^{x}(\vartheta \le \theta) = P_{(\theta,\sigma)}(\tau(X) > \tau(x)) \tag{2}$$

for all $\theta \in \mathbb{R}$.
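
The significance function of Example 1 can be evaluated numerically. The following is a minimal sketch rather than the computation used for the results reported here. It assumes that the t-statistic is centered at the hypothesized mean, so that the significance function at a value theta equals the Student t cumulative distribution function evaluated at (theta - ybar)/(s/sqrt(n)). The helper names `t_cdf`, `summary`, `significance`, and `quantile`, the numerical integration scheme, and the data vector `y` are all illustrative assumptions.

```python
import math

def t_cdf(t, df, steps=400):
    """Student t CDF via the substitution t = sqrt(df) * tan(phi) and Simpson's rule."""
    phi_max = math.atan(abs(t) / math.sqrt(df))
    h = phi_max / steps  # steps must be even for Simpson's rule
    total = 1.0 + math.cos(phi_max) ** (df - 1)  # endpoint terms; cos(0) = 1
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * math.cos(k * h) ** (df - 1)
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    return 0.5 + math.copysign(const * total * h / 3, t)

def summary(y):
    """Sample size, sample mean, and standard error of the mean."""
    n = len(y)
    ybar = sum(y) / n
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    return n, ybar, s / math.sqrt(n)

def significance(theta, y):
    """Assumed form of the significance function: t CDF of the centered statistic."""
    n, ybar, se = summary(y)
    return t_cdf((theta - ybar) / se, n - 1)

def quantile(p, y, iters=200):
    """Invert the significance function by bracketing and bisection, for 0 < p < 1."""
    n, ybar, se = summary(y)
    lo, hi = ybar - se, ybar + se
    while significance(lo, y) > p:   # widen the bracket to the left
        lo -= 2 * (ybar - lo)
    while significance(hi, y) < p:   # widen the bracket to the right
        hi += 2 * (hi - ybar)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if significance(mid, y) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical log-ratio differences for one gene (illustrative values only).
y = [0.8, 1.1, 0.3, 0.9, 1.4, 0.5]
median = quantile(0.5, y)                          # confidence posterior median
interval = (quantile(0.025, y), quantile(0.975, y))  # 95% confidence interval
```

Under these assumptions the posterior median coincides with the sample mean and the 95% interval coincides with the usual t-based confidence interval, which is the sense in which the confidence posterior encodes nested confidence intervals.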



Model $x_i \in \mathcal{X}^n$, the $i$th of $m$ observed data vectors each corresponding to
a gene or other biological feature, as a sample of $P_{\xi_i}$, with $\xi_i \in \Xi$ as the
value of the full parameter and $\theta_i = \theta(\xi_i)$ as the value of the interest
parameter. The $i$th null hypothesis asserts that $\theta_i = \theta_0$, where $\theta_0$
may be any specified value in $\Theta$.



2.2 Empirical Bayes 

Empirical Bayes estimators of the LFDR flow from variations of the following hierarchical
mixture model of a data set that has been reduced to a single scalar statistic per null
hypothesis. Examples of such statistics include test statistics, p-values, and, as in
Efron (2004), probit transformations of p-values. With a measurable map
$\tau : \mathcal{X}^n \to \mathcal{T}$, the observed statistic $t_i = \tau(x_i)$ associated
with the null hypothesis that $\theta_i = \theta_0$ is assumed to be a realization of the
random statistic $T_i$ of the two-component mixture probability density function $f$ such
that

$$f(t) = \pi_0 f_0(t) + \pi_1 f_1(t) \tag{3}$$

for all $t \in \mathcal{T}$, where $\pi_0 \in [0,1]$, $\pi_1 = 1 - \pi_0$, and $f_0$ and
$f_1$ are probability density functions (PDFs) corresponding to the null and alternative
hypotheses, respectively. As the unknown PDF of the statistic conditional on the
alternative hypothesis, $f_1$ is estimated by some $\hat{f}_1$. Herein, $f_0$ is considered
the known PDF of the statistic conditional on the null hypothesis, but it can instead be
estimated if $m$ is sufficiently large (Efron, 2004). The mixture distribution can be
equivalently specified by $f_A$, where $A$ is a random quantity equal to 0 with probability
$\pi_0$ and to 1 with probability $\pi_1$.

Let $t = (t_1, \dots, t_m)$ and $T = (T_1, \dots, T_m)$. (Since $T_i$ and $T_j$ are
identically distributed for all $i, j \in \{1, \dots, m\}$ under the mixture model (3), the
model of Section 2.1 obtains conditionally for the random parameter values.) The local
false discovery rate for the $i$th statistic is defined as the posterior probability that
the $i$th null hypothesis is true:

$$\mathrm{LFDR}(t_i) = P(A_i = 0 \mid T_i = t_i) = \frac{\pi_0 f_0(t_i)}{f(t_i)}.$$

It is estimated by replacing $\pi_0$ and $f_1$ with their estimates:

$$\hat{\ell}_i = \frac{\hat{\pi}_0 f_0(t_i)}{\hat{\pi}_0 f_0(t_i) + (1 - \hat{\pi}_0) \hat{f}_1(t_i)}.$$
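
As a small numerical illustration of the plug-in LFDR estimate, the sketch below evaluates the ratio for given estimates of the null proportion and the alternative density. The function name `lfdr` and the choice of a standard normal null density and a wider normal alternative density are assumptions made only for this example; they are not the estimators used in the paper.

```python
from statistics import NormalDist

def lfdr(t, pi0, f0, f1):
    """Plug-in local false discovery rate: pi0*f0(t) / (pi0*f0(t) + (1 - pi0)*f1(t))."""
    null_part = pi0 * f0(t)
    return null_part / (null_part + (1 - pi0) * f1(t))

# Illustrative densities (assumptions): N(0, 1) under the null, N(0, 3^2) under the alternative.
f0 = NormalDist(0, 1).pdf
f1 = NormalDist(0, 3).pdf

lfdr_center = lfdr(0.0, 0.9, f0, f1)  # statistic at the null value: LFDR near 1
lfdr_tail = lfdr(4.0, 0.9, f0, f1)    # statistic far in the tail: LFDR near 0
```

Statistics near the null value receive an estimate close to 1, whereas statistics in the tails receive an estimate close to 0, which is what drives the shrinkage described in the following subsections.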

2.3 Extended confidence posteriors 

Marginalization over hypothesis truth leads to estimated posterior probabilities that each
parameter of interest is less than, equal to, and greater than the parameter value of the
null hypothesis. Such probabilities are coherent with each confidence posterior given the
truth of the alternative hypothesis according to the confidence-based decision theory of
Bickel (2010a) and Bickel (2010b).



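Consider the probability distribution $P^{(i)}$ of which $P^{x_i}$ is a conditional
probability distribution of $\vartheta_i$ given $\theta_i \ne \theta_0$, of which
$\delta_{\theta_0}$, the Dirac measure at $\theta_0$, is a conditional probability
distribution of $\vartheta_i$ given $\theta_i = \theta_0$, and according to which $\ell_i$
is the probability that $\theta_i = \theta_0$. That is, $P^{(i)}(A_i = 0) = \ell_i$ and,
for all $\theta \in \Theta$,

$$P^{(i)}(\vartheta_i \le \theta \mid A_i = 1) = P^{x_i}(\vartheta_i \le \theta)$$

and, with the function $1_S(\bullet)$ respectively indicating membership and
non-membership in $S$ by 1 and 0,

$$P^{(i)}(\vartheta_i \le \theta \mid A_i = 0) = \delta_{\theta_0}(\vartheta_i \le \theta) = 1_{[\theta_0, \infty)}(\theta).$$

In the more succinct mixture notation,

$$\vartheta_i \sim P^{(i)} = \ell_i \delta_{\theta_0} + (1 - \ell_i) P^{x_i}. \tag{4}$$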



Since $P^{(i)}$ as the inferential parameter distribution follows from applying Kolmogorov
probability theory to the base distributions (the distribution of $A_i$ determined by
$\ell_i$, $\delta_{\theta_0}$, and $P^{x_i}$), decisions made on its basis are those that
would be required by the base distributions in the framework of minimizing expected loss
with respect to a confidence posterior distribution (Bickel, 2010a,b) and, more generally,
with respect to any parameter distribution (e.g., von Neumann and Morgenstern, 1944;
Savage, 1954). For example, as the posterior median $F_{x_i}^{-1}(1/2)$ minimizes the
expected absolute loss involved in estimating $\theta_i$ conditional on $A_i = 1$, the
median of $P^{(i)}$ does so marginally. Thus, $P^{(i)}$ will be called the marginal
confidence posterior and $P^{x_i}$ the conditional confidence posterior given the truth of
the alternative hypothesis. Adapting the terminology of Polansky (2007) concerning fixed
parameters of interest, $P^{(i)}$-probabilities and $P^{x_i}$-probabilities of hypotheses
will be called (observed) marginal and conditional confidence levels, respectively.
Since $\ell_i$ is unknown, the marginal confidence posterior will be estimated by

$$\hat{P}^{(i)} = \hat{\ell}_i \delta_{\theta_0} + (1 - \hat{\ell}_i) P^{x_i}, \tag{5}$$



which resembles the marginal empirical Bayes posterior from which Ghosh (2009) derived
conservative confidence intervals under a parametric model. (Efron (2008) similarly derived
empirical Bayes interval estimates conditional on $A_i = 1$ in order to contrast them with
estimates that control a false coverage rate (Benjamini et al., 2005).) The two posterior
distributions differ in that $P^{x_i}$ is a confidence posterior rather than the Bayesian
posterior $P_{\text{prior}}(\bullet \mid A_i = 1, T_i = t_i)$, which requires specification
or estimation of $P_{\text{prior}}(\bullet \mid A_i = 1)$, a prior distribution of
$\theta_i$ conditional on the truth of the alternative hypothesis. For ease of reading,
$\hat{P}^{(i)}$-probabilities of hypotheses will be called (observed) marginal confidence
levels even though they are more precisely estimates of such levels.
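
The mixture form of the estimated marginal confidence posterior makes the observed marginal confidence levels simple to compute once the LFDR estimate and the conditional confidence level are in hand. The sketch below is illustrative only: the function names `marginal_levels` and `marginal_mean` and the numerical inputs are assumptions, and the conditional confidence posterior is assumed to be continuous so that it assigns no mass to the null value itself.

```python
def marginal_levels(lfdr_hat, cond_below):
    """Marginal confidence levels that the parameter is below, equal to, or above the null value.

    lfdr_hat   -- estimated LFDR, the weight of the point mass at the null value
    cond_below -- conditional confidence level that the parameter is below the null value
    """
    below = (1 - lfdr_hat) * cond_below
    above = (1 - lfdr_hat) * (1 - cond_below)
    return below, lfdr_hat, above

def marginal_mean(lfdr_hat, cond_mean, theta0=0.0):
    """Mean of the mixture: shrinks the conditional mean toward the null value theta0."""
    return lfdr_hat * theta0 + (1 - lfdr_hat) * cond_mean

# Hypothetical inputs: estimated LFDR of 0.4 and conditional level of 0.25 below the null value.
below, at_null, above = marginal_levels(0.4, 0.25)
shrunken_mean = marginal_mean(0.4, cond_mean=1.5)
```

The three levels always sum to one, and the two one-sided levels alone sum to one minus the LFDR estimate, so they fall short of 100% whenever that estimate is positive.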

Example 2. Generalizing Example 1 to multiple genes, let $x_i$ denote the $n$-tuple of
log-transformed differences in levels of expression of the $i$th of $m$ genes. The $i$th
null hypothesis is that the $i$th gene is equivalently expressed ($\theta_i = 0$) as
opposed to differentially expressed ($\theta_i \ne 0$). Further, let $P^{x_i}$ denote the
corresponding confidence posterior defined by equation (2) and the normality and
conditional independence assumptions of Example 1. Then $P^{x_i}$ is mathematically
equivalent to the Bayesian posterior $P_{\text{prior}}(\bullet \mid A_i = 1, T_i = t_i)$
formulated by the uniform "distribution" (Lebesgue measure) as the prior for $\theta_i$
and integrating over the standard deviation $\sigma$ with respect to the posterior from the
independent prior density proportional to $1/\sigma$. Since the prior is not a Kolmogorov
probability distribution, the estimated posterior odds given by multiplying the estimated
prior odds $(1 - \hat{\pi}_0)/\hat{\pi}_0$ by the Bayes factor is undefined (Berger and
Pericchi, 1996; Yanofsky and Bickel, 2010). Thus, there is no prior distribution that
corresponds to $\hat{P}^{(i)}$ in this example. (Yanofsky and Bickel (2010) used a
predictive distribution on the basis of an intuitively motivated posterior equivalent to
$\hat{P}^{(i)}$ to assess the performance of various predictors of gene expression data.)



2.4 Point and interval estimates 

Were the marginal confidence posterior $P^{(i)}$ known, its mean and median would
respectively minimize expected square-error and absolute loss incurred by estimating
$\theta_i$ (§2.1), and the odds for betting that $\theta_i$ lies in some subset $\Theta'$
of $\Theta$ would be
$P^{(i)}(\vartheta_i \in \Theta') / P^{(i)}(\vartheta_i \in \Theta \setminus \Theta')$,
a ratio of two observed marginal confidence levels (Bickel, 2010a).



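Those decision-theoretic considerations suggest estimating $\theta_i$ by the mean or median
of $\hat{P}^{(i)}$ and constructing the $(1 - \alpha_1 - \alpha_2) 100\%$ marginal
confidence interval $[F_{(i)}^{-1}(\alpha_1), F_{(i)}^{-1}(1 - \alpha_2)]$ from the marginal
significance function $F_{(i)}$ defined by
$F_{(i)}(\theta) = \hat{P}^{(i)}(\vartheta_i \le \theta)$ for all $\theta \in \Theta$.
Inverting $F_{(i)}$ gives, for any $\alpha \in [0,1]$,

$$F_{(i)}^{-1}(\alpha) = \begin{cases}
F_{x_i}^{-1}\left(\alpha / (1 - \hat{\ell}_i)\right) & \text{if } F_{x_i}^{-1}\left(\alpha / (1 - \hat{\ell}_i)\right) < \theta_0 \\
F_{x_i}^{-1}\left(1 - (1 - \alpha)/(1 - \hat{\ell}_i)\right) & \text{if } F_{x_i}^{-1}\left(1 - (1 - \alpha)/(1 - \hat{\ell}_i)\right) > \theta_0 \\
\theta_0 & \text{otherwise}
\end{cases} \tag{6}$$

where $F_{x_i}$ is the conditional significance function defined in Section 2.1.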
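
The inversion of the marginal significance function can be sketched in a few lines. The code below is an illustration under stated assumptions, not the paper's implementation: the function name `marginal_quantile` is hypothetical, arguments outside the unit interval are treated as failing the corresponding condition, and a normal distribution (the confidence posterior of a normal mean with known variance) stands in for the t-based conditional confidence posterior of Example 1.

```python
from statistics import NormalDist

def marginal_quantile(alpha, lfdr_hat, cond_quantile, theta0=0.0):
    """Quantile of the mixture of a point mass at theta0 (weight lfdr_hat) and the
    conditional confidence posterior (weight 1 - lfdr_hat), for 0 < alpha < 1."""
    if lfdr_hat >= 1.0:
        return theta0  # all mass sits at the null value
    p_low = alpha / (1 - lfdr_hat)
    if 0.0 < p_low < 1.0:
        q = cond_quantile(p_low)
        if q < theta0:
            return q  # quantile falls below the null value
    p_high = 1 - (1 - alpha) / (1 - lfdr_hat)
    if 0.0 < p_high < 1.0:
        q = cond_quantile(p_high)
        if q > theta0:
            return q  # quantile falls above the null value
    return theta0  # quantile is absorbed by the point mass

# Stand-in conditional confidence posterior (assumption): N(1.0, 0.5^2).
cond = NormalDist(mu=1.0, sigma=0.5).inv_cdf

median_weak = marginal_quantile(0.5, 0.3, cond)    # shrunk toward 0 but still positive
median_strong = marginal_quantile(0.5, 0.6, cond)  # point mass exceeds 1/2
lower = marginal_quantile(0.025, 0.3, cond)        # lower limit of the 95% marginal interval
upper = marginal_quantile(0.975, 0.3, cond)        # upper limit of the 95% marginal interval
```

With an estimated LFDR above one half, the marginal median is the null value itself. With a smaller estimate, the median and interval limits are pulled toward the null value, and in this example the marginal interval is shorter than the conditional one.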



While the $\hat{P}^{(i)}$-probability that $\vartheta_i$ lies in the interval estimate is
exactly $(1 - \alpha_1 - \alpha_2) 100\%$ by construction, it does not have exact
frequentist coverage. However, two limiting cases suggest that the marginal confidence
interval covers the random value of $\theta_i$ at a relative frequency greater than the
nominal rate $\rho = (1 - \alpha_1 - \alpha_2) 100\%$:

$$\lim_{\Delta \to 0} P_{\text{true}}\left(\theta_i \le F_{(i)}^{-1}(\alpha) \mid A_i = 1, \hat{\ell}_i < \Delta\right) = P_{\text{true}}\left(F_{X_i}(\theta_i) \le \alpha \mid A_i = 1\right) = \alpha \tag{7}$$

$$\lim_{\Delta \to 1} P_{\text{true}}\left(\theta_i \le F_{(i)}^{-1}(\alpha) \mid A_i = 0, \hat{\ell}_i > \Delta\right) = P_{\text{true}}\left(\theta_i = \theta_0 \mid A_i = 0\right) = 1, \tag{8}$$

where $0 < \alpha < 1$ and $0 < \Delta < 1$; $P_{\text{true}}$ is the joint sampling
distribution of $\theta_i$ and the data. To the extent that $1 - \pi_0$ is small, the
actual coverage rate
$P_{\text{true}}\left(\theta_i \in [F_{(i)}^{-1}(\alpha_1), F_{(i)}^{-1}(1 - \alpha_2)]\right)$
is dominated by the rate conditional on $A_i = 0$. For that reason, equation (8) indicates
that, inasmuch as $\hat{\pi}_0$ is a positively biased estimator of some sufficiently small
$\pi_0$, the confidence intervals derived from $\hat{P}^{(i)}$ are conservative in the sense
that they include the random value of $\theta_i$ at a relative frequency higher than the
nominal $(1 - \alpha_1 - \alpha_2) 100\%$ level for any $\alpha_1, \alpha_2 \in [0,1]$ such
that $\alpha_1 + \alpha_2 < 1$.

Likewise, the $\hat{P}^{(i)}$-posterior median $F_{(i)}^{-1}(1/2)$, corresponding to the
degenerate 0% confidence interval $[F_{(i)}^{-1}(50\%), F_{(i)}^{-1}(50\%)]$, is
conservative in the sense that a positive bias in $\hat{\ell}_i$ pulls the estimate
$F_{(i)}^{-1}(1/2)$ toward $\theta_0$. The extent of the conservatism of both point and
interval estimates was quantified by simulation as described in Section 4.



3 Application to gene expression 

Microarray technology enables measurement of the expression levels of thousands of genes 
for each biological replicate, an organism or set of organisms studied. Most microarray 
experiments are designed to determine which genes to consider differentially expressed across 
two conditions, conveniently called treatment and control. Investigators initially relied on
estimates of an average ratio of expression under the treatment condition to that under
the control condition without using hypothesis tests. As statisticians have responded with 
extensive research on multiple comparison procedures, biologists have moved to ignoring 
estimated levels of differential expression for all genes that do not correspond to rejected 
null hypotheses. 



In response, Montazeri et al. (2010) proposed the prioritization of genes for further study
by shrunken estimates of differential expression levels, much as Stromberg et al. (2008)
and Wei et al. (2010) suggested prioritizing single-nucleotide polymorphisms (SNPs) by
shrunken estimates of odds ratios. Whereas Montazeri et al. (2010), following Bickel (2008)
and Yanofsky and Bickel (2010), used a heuristic estimate equal in value to the posterior
mean with respect to $\hat{P}^{(i)}$, the posterior median has the advantage of invariance
to reparameterization. Since, in addition, the posterior median is a limiting case of a
confidence interval (§2.4), it will be used as the point estimate alongside the interval
estimate.

While point estimation is practical for ranking genes in order of priority, interval estimates 
are needed to quantify their reliability. In place of the commonly used confidence intervals 
that do not account for multiple comparisons, we will report the shrunken confidence intervals 
of equation (6).

The amounts by which gene expression differs between mutant tomatoes and wild type (WT)
tomatoes were estimated for n = 6 mutant-WT ("treatment-control") pairs at 3 days after the
breaker stage of ripening; the microarrays represent 13,440 genes. Alba et al. (2005)
provide details of the fruit development experiments conducted. Due to the pairing of
mutant and WT biological replicates, the normal model and confidence posteriors of
Examples 1 and 2 were used. Each local false discovery rate was estimated by the
"theoretical null" method of Efron (2007) since simulations indicate that the "empirical
null" method applied to the model of Examples 1 and 2 loses power in the presence of
heavy-tailed data like that of gene expression (Bickel, 2010b).

Each circle of Fig. 1 represents a point or interval estimate of $\theta_i$ for a gene.
The left-hand side displays the posterior median of $\vartheta_i$ versus $\hat{\ell}_i$ on
the basis of each marginal confidence posterior $\hat{P}^{(i)}$ (black) and each conditional
confidence posterior $P^{x_i}$ (gray). Stronger shrinkage is evident at higher values of
$\hat{\ell}_i$.

The right-hand side of Fig. 1 features the width of the confidence interval from each
$\hat{P}^{(i)}$ versus the width of the confidence interval from each $P^{x_i}$. It is
apparent that the use of the marginal confidence posterior in place of the conditional
confidence posterior tends to substantially reduce interval width.

In Fig. 2, observed marginal confidence levels are plotted against observed conditional
confidence levels to show how much inferential probability each attributes to the
hypothesis that $\theta_i < 0$ and to the hypothesis that $\theta_i > 0$. The horizontal
axis has $P^{x_i}(\vartheta_i < 0)$, which is equal to
$\hat{P}^{(i)}(\vartheta_i < 0 \mid A_i = 1)$ and to $1 - P^{x_i}(\vartheta_i > 0)$. The
vertical axis has $\hat{P}^{(i)}(\vartheta_i < 0)$ in black and
$\hat{P}^{(i)}(\vartheta_i > 0)$ in gray; these marginal confidence levels do not total 100%
since $\hat{P}^{(i)}(\vartheta_i = 0) = \hat{\ell}_i > 0$.



4 Simulation study 

Levels of gene expression and corresponding observations were simulated for 2000 gene
expression experiments each with $\pi_0 = 90\%$ probability that any gene is equivalently
expressed, $n = 2$ observations per gene, and $m = 10^4$ genes, as follows. For each
experiment, the mean differential expression levels $\theta_1, \dots, \theta_m$ were
independently assigned 0 with probability $\pi_0$, $-2$ with probability $(1 - \pi_0)/2$,
and $+2$ with probability $(1 - \pi_0)/2$. Then, for each $i = 1, \dots, m$, the $n$
observed expression levels were independently drawn from $N(\theta_i, \sigma_i^2)$, where
$\sigma_i = 1$ if $\theta_i = 0$ and $\sigma_i = 3/2$ if $\theta_i \ne 0$, in accordance
with Examples 1 and 2. The posterior medians $F_{x_i}^{-1}(50\%)$ and $F_{(i)}^{-1}(50\%)$
and the 95% confidence intervals $[F_{x_i}^{-1}(2.5\%), F_{x_i}^{-1}(97.5\%)]$ and
$[F_{(i)}^{-1}(2.5\%), F_{(i)}^{-1}(97.5\%)]$ were computed for each simulated experiment.
A total of 2000 experiments were thereby simulated and analyzed. To assess dependence on
the proportion of true null hypotheses, all of the simulations and analyses were repeated
for $\pi_0 = 99\%$ using the same seed of the pseudo-random numbers.
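
The data-generating step of the simulation can be sketched as follows. This is a reduced illustration rather than a reproduction of the study: it simulates a single experiment, checks only the conventional (conditional) 95% interval, and omits the LFDR estimation needed for the marginal interval. The function name `simulate_conditional_coverage`, the seed, and the default number of genes are assumptions. With two observations per gene, the Student t distribution has one degree of freedom, so its 97.5% quantile is tan(0.475 pi).

```python
import math
import random

def simulate_conditional_coverage(m=20000, pi0=0.9, seed=1):
    """Fraction of m simulated genes whose conventional 95% t-interval covers the true mean."""
    rng = random.Random(seed)
    t_crit = math.tan(math.pi * 0.475)  # 97.5% quantile of Student t with 1 degree of freedom
    covered = 0
    for _ in range(m):
        u = rng.random()
        if u < pi0:                       # equivalently expressed gene
            theta, sigma = 0.0, 1.0
        elif u < pi0 + (1 - pi0) / 2:     # differentially expressed, negative mean
            theta, sigma = -2.0, 1.5
        else:                             # differentially expressed, positive mean
            theta, sigma = 2.0, 1.5
        y1 = rng.gauss(theta, sigma)
        y2 = rng.gauss(theta, sigma)
        ybar = (y1 + y2) / 2
        se = abs(y1 - y2) / 2             # s / sqrt(n) for n = 2
        if ybar - t_crit * se <= theta <= ybar + t_crit * se:
            covered += 1
    return covered / m

coverage = simulate_conditional_coverage()
```

The conditional intervals should cover at close to the nominal 95% rate, which provides the baseline against which the coverage of the shrunken marginal intervals is compared.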

Fig. 3 consists of histograms of the posterior median errors
$F_{(i)}^{-1}(1/2) - \theta_i$ (black) and $F_{x_i}^{-1}(1/2) - \theta_i$ (gray) according
to each marginal confidence posterior $\hat{P}^{(i)}$ and each conditional confidence
posterior $P^{x_i}$, respectively. The left panel corresponds to $1 - \pi_0 = 10\%$ and the
right panel to $1 - \pi_0 = 1\%$.

Fig. 4 gives the width of the confidence interval from each $\hat{P}^{(i)}$ versus the
width of the confidence interval from each $P^{x_i}$ for $1 - \pi_0 = 10\%$ (left) and
$1 - \pi_0 = 1\%$ (right). Each circle corresponds to a simulated experiment. As seen in
the application to gene expression (Section 3), the marginal confidence intervals tend to
be much shorter than the conditional confidence intervals.

The smaller intervals do not compromise frequentist coverage. On the contrary, the
confidence intervals from $\hat{P}^{(i)}$ cover the simulated values of $\theta_i$ at rates
higher than the nominal 95% level (Table 1), in agreement with equations (7) and (8).

5 Discussion 

As an extension of both a confidence posterior and an empirical Bayes posterior,
$\hat{P}^{(i)}$ offers new approaches to two related problems in high-dimensional biology.
First, the problem of prioritizing biological features for further study was addressed by
ranking the features according to their $\hat{P}^{(i)}$-posterior means or medians. Since a
point estimate for an individual feature of scientific interest is difficult to interpret
without an indication of its reliability, the problem of reporting interval estimates
consonant with the point estimates was handled by constructing the confidence intervals of
each feature to have the posterior median at its center. The next two paragraphs summarize
the findings relevant to each proposed solution in turn.

The posterior median of $\hat{P}^{(i)}$ is suitable for ranking features in order of
priority or interest since it is parameterization-invariant and since it adjusts the
uncorrected parameter estimate according to statistical significance as recorded in the
LFDR. The commonly used alternative of using the LFDR or other measure of significance to
make an accept-reject decision followed by conventional estimation of the parameter does
not perform well since it depends on an arbitrary threshold to distinguish acceptance from
rejection (Montazeri et al., 2010). The simulations show that the posterior median of
$\hat{P}^{(i)}$ does perform well in terms of hitting or coming close to its target
parameter value (Fig. 3).

The confidence intervals based on $\hat{P}^{(i)}$ are not only centered at the estimates
recommended for ranking features, but also tend to be much shorter than the
fixed-parameter confidence intervals on which they are based, as seen both in the
application to gene expression (Fig. 1) and in the simulation study (Fig. 4). In spite of
their shortness, the shrunken confidence intervals cover their target parameter values at
rates higher than those claimed (Table 1).

Some caution is needed in interpreting $\hat{P}^{(i)}(\vartheta_i < \theta_0)$,
$\hat{P}^{(i)}(\vartheta_i = \theta_0)$, $\hat{P}^{(i)}(\vartheta_i > \theta_0)$, and other
observed marginal confidence levels as posterior probabilities for decision-making
purposes. Since the LFDR estimate $\hat{\ell}_i$ is conservative in the sense that it has
an upward bias (Pawitan et al., 2005; Yang and Bickel, 2010), the
$\hat{P}^{(i)}$-probability of any hypothesis that includes or excludes $\theta_0$ will
tend to be too high or too low, respectively. For example, there is no warrant for
concluding from $\hat{P}^{(i)}(\vartheta_i = \theta_0) = \hat{\ell}_i = 100\%$ that the null
hypothesis is true with absolute certainty (Bickel, 2010b; Yang and Bickel, 2010), and the
observed conditional confidence of Fig. 2 and studied by Bickel (2010b) would thus perform
better in terms of logarithmic loss or other scoring rules that infinitely penalize
predicting an event with certainty that does not occur.

Figure 1: Point and interval estimates of gene expression.

Figure 2: Observed confidence levels about mean differential expression.

Figure 3: Point estimate performance.

Figure 4: 95% confidence interval performance.

Table 1: 95% confidence interval coverage.

                          1 - π0 = 10%    1 - π0 = 1%
Marginal confidence       97.5%           99.2%
Conditional confidence